A Novel P2P Traffic Identification Model Based on Ensemble Learning

نویسندگان

  • Ru-chuan WANG
  • He XU
  • Suo-ping WANG
چکیده

Peer-to-peer (P2P) traffic has occupied major fraction of all internet traffic. Hence, P2P flow identification becomes an important problem for network management. In our work, we propose an ensemble classification approach for P2P traffic identification, which integrates six DTNB(combination of naive Bayes and decision tables) algorithm and dynamic weighted integration method. The proposed P2P identification scheme can be divided into three stages. In the first stage, we use feature selection algorithm to extract P2P flow characteristics. In the second stage, we use DTNB algorithm to learning the pattern of P2P traffic characteristics. In the third stage, we use dynamic weighted integration method to increase the detection accuracy and reduce false positive in classification. To verify the performance of the proposed P2P identification based on ensemble classification, we collect network traffic traces from NJUPT campus using NETMATE, and run WEKA experiments. The experimental results show that the ensemble classification approach for P2P flow identification can achieve at an average of 97% accuracy rate and 4% false positive rate. Through experiment and giving comparisons of precision, true positive, false positive and ROC curve between the proposed ensemble method and traditional methods such as naive Bayes(NB) , decision trees(DT) and single DTNB algorithm, we find that the proposed method has a better P2P traffic identification accuracy and stability.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Novel Caching Strategy in Video-on-Demand (VoD) Peer-to-Peer (P2P) Networks Based on Complex Network Theory

The popularity of video-on-demand (VoD) streaming has grown dramatically over the World Wide Web. Most users in VoD P2P networks have to wait a long time in order to access their requesting videos. Therefore, reducing waiting time to access videos is the main challenge for VoD P2P networks. In this paper, we propose a novel algorithm for caching video based on peers' priority and video's popula...

متن کامل

A Novel Caching Strategy in Video-on-Demand (VoD) Peer-to-Peer (P2P) Networks Based on Complex Network Theory

The popularity of video-on-demand (VoD) streaming has grown dramatically over the World Wide Web. Most users in VoD P2P networks have to wait a long time in order to access their requesting videos. Therefore, reducing waiting time to access videos is the main challenge for VoD P2P networks. In this paper, we propose a novel algorithm for caching video based on peers' priority and video's popula...

متن کامل

A Framework For Concept Drifting P2P Traffic Identification

Identification of network traffic using port-based or payload-based analysis is becoming increasing difficult with many Peer-to-Peer (P2P) application using dynamic ports, masquerading techniques, and encryption to avoid detection. To overcome this problem, several machine learning technique were proposed to classify P2P traffics. But in the real P2P network environment, new communities of peer...

متن کامل

P2P Network Traffic Identification Based on Random Forest Algorithm

With the rapid development of computer technique in the past decades, the emergence of P2P techniqueprompts the network computing model evolving from centralized network to distributed network. Although P2P technique has brought tremendous changes to the network technique, P2P technique also exposes a lot of problems during its implementation. If we can manage the P2P network traffic effectivel...

متن کامل

A Pre-Trained Ensemble Model for Breast Cancer Grade Detection Based on Small Datasets

Background and Purpose: Nowadays, breast cancer is reported as one of the most common cancers amongst women. Early detection of the cancer type is essential to aid in informing subsequent treatments. The newest proposed breast cancer detectors are based on deep learning. Most of these works focus on large-datasets and are not developed for small datasets. Although the large datasets might lead ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011